By: Suraj Honkamble.
import pandas as pd
pd.options.display.max_columns=50
import numpy as np
import seaborn as sns
sns.set_style('darkgrid')
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
import plotly.express as px
import plotly.graph_objects as go
df=pd.read_csv('D:\\DATA SCIENCE Internship with CodersCave\\Data\\Covid Data.csv')
df.head()
| | USMER | MEDICAL_UNIT | SEX | PATIENT_TYPE | DATE_DIED | INTUBED | PNEUMONIA | AGE | PREGNANT | DIABETES | COPD | ASTHMA | INMSUPR | HIPERTENSION | OTHER_DISEASE | CARDIOVASCULAR | OBESITY | RENAL_CHRONIC | TOBACCO | CLASIFFICATION_FINAL | ICU |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 1 | 1 | 1 | 03/05/2020 | 97 | 1 | 65 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 97 |
| 1 | 2 | 1 | 2 | 1 | 03/06/2020 | 97 | 1 | 72 | 97 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 5 | 97 |
| 2 | 2 | 1 | 2 | 2 | 09/06/2020 | 1 | 2 | 55 | 97 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2 |
| 3 | 2 | 1 | 1 | 1 | 12/06/2020 | 97 | 2 | 53 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 7 | 97 |
| 4 | 2 | 1 | 2 | 1 | 21/06/2020 | 97 | 2 | 68 | 97 | 1 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 97 |
In the Boolean features, 1 means "yes" and 2 means "no"; the values 97, 98 and 99 mark missing data.
df.shape
(1048575, 21)
df.columns
Index(['USMER', 'MEDICAL_UNIT', 'SEX', 'PATIENT_TYPE', 'DATE_DIED', 'INTUBED',
'PNEUMONIA', 'AGE', 'PREGNANT', 'DIABETES', 'COPD', 'ASTHMA', 'INMSUPR',
'HIPERTENSION', 'OTHER_DISEASE', 'CARDIOVASCULAR', 'OBESITY',
'RENAL_CHRONIC', 'TOBACCO', 'CLASIFFICATION_FINAL', 'ICU'],
dtype='object')
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1048575 entries, 0 to 1048574
Data columns (total 21 columns):
 #   Column                Non-Null Count    Dtype
---  ------                --------------    -----
 0   USMER                 1048575 non-null  int64
 1   MEDICAL_UNIT          1048575 non-null  int64
 2   SEX                   1048575 non-null  int64
 3   PATIENT_TYPE          1048575 non-null  int64
 4   DATE_DIED             1048575 non-null  object
 5   INTUBED               1048575 non-null  int64
 6   PNEUMONIA             1048575 non-null  int64
 7   AGE                   1048575 non-null  int64
 8   PREGNANT              1048575 non-null  int64
 9   DIABETES              1048575 non-null  int64
 10  COPD                  1048575 non-null  int64
 11  ASTHMA                1048575 non-null  int64
 12  INMSUPR               1048575 non-null  int64
 13  HIPERTENSION          1048575 non-null  int64
 14  OTHER_DISEASE         1048575 non-null  int64
 15  CARDIOVASCULAR        1048575 non-null  int64
 16  OBESITY               1048575 non-null  int64
 17  RENAL_CHRONIC         1048575 non-null  int64
 18  TOBACCO               1048575 non-null  int64
 19  CLASIFFICATION_FINAL  1048575 non-null  int64
 20  ICU                   1048575 non-null  int64
dtypes: int64(20), object(1)
memory usage: 168.0+ MB
df.isna().sum()
USMER                   0
MEDICAL_UNIT            0
SEX                     0
PATIENT_TYPE            0
DATE_DIED               0
INTUBED                 0
PNEUMONIA               0
AGE                     0
PREGNANT                0
DIABETES                0
COPD                    0
ASTHMA                  0
INMSUPR                 0
HIPERTENSION            0
OTHER_DISEASE           0
CARDIOVASCULAR          0
OBESITY                 0
RENAL_CHRONIC           0
TOBACCO                 0
CLASIFFICATION_FINAL    0
ICU                     0
dtype: int64
df.nunique()
USMER                     2
MEDICAL_UNIT             13
SEX                       2
PATIENT_TYPE              2
DATE_DIED               401
INTUBED                   4
PNEUMONIA                 3
AGE                     121
PREGNANT                  4
DIABETES                  3
COPD                      3
ASTHMA                    3
INMSUPR                   3
HIPERTENSION              3
OTHER_DISEASE             3
CARDIOVASCULAR            3
OBESITY                   3
RENAL_CHRONIC             3
TOBACCO                   3
CLASIFFICATION_FINAL      7
ICU                       4
dtype: int64
cat=['USMER', 'MEDICAL_UNIT', 'SEX', 'PATIENT_TYPE', 'INTUBED',
'PNEUMONIA', 'PREGNANT', 'DIABETES', 'COPD', 'ASTHMA', 'INMSUPR',
'HIPERTENSION', 'OTHER_DISEASE', 'CARDIOVASCULAR', 'OBESITY',
'RENAL_CHRONIC', 'TOBACCO', 'CLASIFFICATION_FINAL', 'ICU']
for i in cat:
    print("Column Name : {}\nTotal unique values: {}\nUnique values: {}".format(i, df[i].nunique(), df[i].unique()))
    print("**--**"*10)
Column Name : USMER
Total unique values: 2
Unique values: [2 1]
**--****--****--****--****--****--****--****--****--****--**
Column Name : MEDICAL_UNIT
Total unique values: 13
Unique values: [ 1 2 3 4 5 6 7 8 9 10 11 12 13]
**--****--****--****--****--****--****--****--****--****--**
Column Name : SEX
Total unique values: 2
Unique values: [1 2]
**--****--****--****--****--****--****--****--****--****--**
Column Name : PATIENT_TYPE
Total unique values: 2
Unique values: [1 2]
**--****--****--****--****--****--****--****--****--****--**
Column Name : INTUBED
Total unique values: 4
Unique values: [97 1 2 99]
**--****--****--****--****--****--****--****--****--****--**
Column Name : PNEUMONIA
Total unique values: 3
Unique values: [ 1 2 99]
**--****--****--****--****--****--****--****--****--****--**
Column Name : PREGNANT
Total unique values: 4
Unique values: [ 2 97 98 1]
**--****--****--****--****--****--****--****--****--****--**
Column Name : DIABETES
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : COPD
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : ASTHMA
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : INMSUPR
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : HIPERTENSION
Total unique values: 3
Unique values: [ 1 2 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : OTHER_DISEASE
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : CARDIOVASCULAR
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : OBESITY
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : RENAL_CHRONIC
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : TOBACCO
Total unique values: 3
Unique values: [ 2 1 98]
**--****--****--****--****--****--****--****--****--****--**
Column Name : CLASIFFICATION_FINAL
Total unique values: 7
Unique values: [3 5 7 6 1 2 4]
**--****--****--****--****--****--****--****--****--****--**
Column Name : ICU
Total unique values: 4
Unique values: [97 2 1 99]
**--****--****--****--****--****--****--****--****--****--**
The values 97, 98 and 99 correspond to missing data. Let's replace them with np.nan.
multi=['INTUBED', 'PNEUMONIA', 'PREGNANT', 'DIABETES', 'COPD', 'ASTHMA', 'INMSUPR',
'HIPERTENSION', 'OTHER_DISEASE', 'CARDIOVASCULAR', 'OBESITY',
'RENAL_CHRONIC', 'TOBACCO', 'ICU']
The columns above contain the values 97, 98 and 99; let's replace them.
for i in multi:
    # Assign the result back rather than relying on a chained inplace replace
    df[i]=df[i].replace({99:np.nan, 97:np.nan, 98:np.nan})
    print("Column Name : {}\nTotal unique values: {}\nUnique values: {}".format(i, df[i].nunique(), df[i].unique()))
    print("**--**"*10)
Column Name : INTUBED
Total unique values: 2
Unique values: [nan 1. 2.]
**--****--****--****--****--****--****--****--****--****--**
Column Name : PNEUMONIA
Total unique values: 2
Unique values: [ 1. 2. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : PREGNANT
Total unique values: 2
Unique values: [ 2. nan 1.]
**--****--****--****--****--****--****--****--****--****--**
Column Name : DIABETES
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : COPD
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : ASTHMA
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : INMSUPR
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : HIPERTENSION
Total unique values: 2
Unique values: [ 1. 2. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : OTHER_DISEASE
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : CARDIOVASCULAR
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : OBESITY
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : RENAL_CHRONIC
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : TOBACCO
Total unique values: 2
Unique values: [ 2. 1. nan]
**--****--****--****--****--****--****--****--****--****--**
Column Name : ICU
Total unique values: 2
Unique values: [nan 2. 1.]
**--****--****--****--****--****--****--****--****--****--**
df.isna().sum()
USMER                        0
MEDICAL_UNIT                 0
SEX                          0
PATIENT_TYPE                 0
DATE_DIED                    0
INTUBED                 855869
PNEUMONIA                16003
AGE                          0
PREGNANT                527265
DIABETES                  3338
COPD                      3003
ASTHMA                    2979
INMSUPR                   3404
HIPERTENSION              3104
OTHER_DISEASE             5045
CARDIOVASCULAR            3076
OBESITY                   3032
RENAL_CHRONIC             3006
TOBACCO                   3220
CLASIFFICATION_FINAL         0
ICU                     856032
dtype: int64
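The raw missing counts are easier to judge as shares of the 1,048,575 rows. A minimal sketch, using a few counts copied from the output above; the `missing_counts` Series is only illustrative, and on the full DataFrame `df.isna().mean()` should give the same shares directly.

```python
import pandas as pd

n_rows = 1_048_575  # row count reported by df.shape

# A few missing-value counts copied from the df.isna().sum() output
missing_counts = pd.Series(
    {"INTUBED": 855_869, "ICU": 856_032, "PREGNANT": 527_265, "PNEUMONIA": 16_003}
)

# Share of rows with a missing value, as a percentage
missing_share = (missing_counts / n_rows * 100).round(1)
print(missing_share.sort_values(ascending=False))
```

INTUBED and ICU are missing for more than 80% of the rows, so any crosstab on those two columns describes only a small part of the data.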
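def test_result(result):
    # Collapse the 7 classification codes into 1 (infected) or 2 (not infected).
    # Only codes 1 and 3 are mapped to "infected"; code 2 falls into
    # "not infected", and every count below reflects that choice.
    if result in (1, 3):
        return 1
    else:
        return 2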
df['CLASIFFICATION_FINAL']=df['CLASIFFICATION_FINAL'].apply(test_result)
df['CLASIFFICATION_FINAL'].unique()
array([1, 2], dtype=int64)
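The same recoding can be done without a row-wise `apply`. A sketch on a small hypothetical Series named `codes` (not the real column), keeping the rule used above in which only codes 1 and 3 become 1:

```python
import numpy as np
import pandas as pd

# Hypothetical sample of classification codes (the real column holds 1..7)
codes = pd.Series([3, 5, 7, 6, 1, 2, 4])

# Same rule as test_result: codes 1 and 3 -> 1, everything else -> 2
recoded = pd.Series(np.where(codes.isin([1, 3]), 1, 2), index=codes.index)
print(recoded.tolist())
```

On a column with about a million rows, a vectorized expression like this is usually much faster than calling a Python function per row.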
df.duplicated().sum()
837028
In DATE_DIED, the value "9999-99-99" indicates that the person did not die. Using this column, create a new column "DEAD": return 2 (Not Died) if DATE_DIED is "9999-99-99", otherwise return 1 (Died).
def died(dod):
    if dod=='9999-99-99':
        return 2
    else:
        return 1
df['DEAD'] = df['DATE_DIED'].apply(died)
df['DEAD'].unique()
array([1, 2], dtype=int64)
df['DATE_DIED']=df['DATE_DIED'].replace({'9999-99-99':np.nan})
# DATE_DIED is written day-first (e.g. 21/06/2020), so parse it that way
df['DATE_DIED']=pd.to_datetime(df['DATE_DIED'], dayfirst=True)
df.dtypes
USMER                            int64
MEDICAL_UNIT                     int64
SEX                              int64
PATIENT_TYPE                     int64
DATE_DIED               datetime64[ns]
INTUBED                        float64
PNEUMONIA                      float64
AGE                              int64
PREGNANT                       float64
DIABETES                       float64
COPD                           float64
ASTHMA                         float64
INMSUPR                        float64
HIPERTENSION                   float64
OTHER_DISEASE                  float64
CARDIOVASCULAR                 float64
OBESITY                        float64
RENAL_CHRONIC                  float64
TOBACCO                        float64
CLASIFFICATION_FINAL             int64
ICU                            float64
DEAD                             int64
dtype: object
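The sample rows contain day-first dates such as 21/06/2020, so an ambiguous value like 03/05/2020 can be read as 3 May or as 5 March depending on how it is parsed. A small sketch of the difference, using `pd.to_datetime` with an explicit `format`; the two strings are taken from the `df.head()` output.

```python
import pandas as pd

raw = pd.Series(["03/05/2020", "21/06/2020"])  # values seen in df.head()

# Explicit day-first format: no ambiguity
day_first = pd.to_datetime(raw, format="%d/%m/%Y")
print(day_first.dt.month.tolist())

# The first value read month-first lands in March instead of May
month_first = pd.to_datetime("03/05/2020", format="%m/%d/%Y")
print(month_first.month)
```

Any per-month death count depends on this choice, because every date whose day is 12 or lower can be moved to a different month.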
plt.figure(figsize=(12,5))
# fill=True replaces the deprecated shade argument
sns.kdeplot(df['AGE'], fill=True)
plt.title("Distribution of Patients Age", fontsize=18, color='green');
df.hist(figsize=(20,20));
plt.figure(figsize=(20,10))
# Correlate numeric columns only; DATE_DIED is a datetime column
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='Blues');
# Treat every column except AGE and DATE_DIED as categorical (object dtype)
for i in df.columns:
    if i not in ('AGE', 'DATE_DIED'):
        df[i]=df[i].astype('O')
df.dtypes
USMER                           object
MEDICAL_UNIT                    object
SEX                             object
PATIENT_TYPE                    object
DATE_DIED               datetime64[ns]
INTUBED                         object
PNEUMONIA                       object
AGE                              int64
PREGNANT                        object
DIABETES                        object
COPD                            object
ASTHMA                          object
INMSUPR                         object
HIPERTENSION                    object
OTHER_DISEASE                   object
CARDIOVASCULAR                  object
OBESITY                         object
RENAL_CHRONIC                   object
TOBACCO                         object
CLASIFFICATION_FINAL            object
ICU                             object
DEAD                            object
dtype: object
df['USMER'].replace({1:"Primary Treatment", 2:"No Primary Treatment"}, inplace=True)
df['SEX'].replace({1:"Female", 2:"Male"}, inplace=True)
df['PATIENT_TYPE'].replace({1:"Sent Home", 2:"Hospitalized"}, inplace=True)
df['INTUBED'].replace({1:"On Ventilator", 2:"Not On Ventilator"}, inplace=True)
df['PNEUMONIA'].replace({1:"Pneumonia Patient", 2:"Not Pneumonia Patient"}, inplace=True)
df['PREGNANT'].replace({1:"Pregnent", 2:"Not Pregnent"}, inplace=True)
df['DIABETES'].replace({1:"Diabetic Patient", 2:"Non Diabetic Patient"}, inplace=True)
df['COPD'].replace({1:"COP Patient", 2:"Non COP Patient"}, inplace=True)
df['ASTHMA'].replace({1:"Asthama Patient", 2:"Non Asthama Patient"}, inplace=True)
df['INMSUPR'].replace({1:"Weak Immune System", 2:"Strong Immune System"}, inplace=True)
df['HIPERTENSION'].replace({1:"Hipertension Patient", 2:"Non Hipertension Patience"}, inplace=True)
df['OTHER_DISEASE'].replace({1:"Other Diseases", 2:"No Other Diseases"}, inplace=True)
df['CARDIOVASCULAR'].replace({1:"Heart Patient", 2:"Not Heart Patient"}, inplace=True)
df['OBESITY'].replace({1:"Obese", 2:"Not Obese"}, inplace=True)
df['RENAL_CHRONIC'].replace({1:"Chronic", 2:"Not Chronic"}, inplace=True)
df['TOBACCO'].replace({1:"Consume Tobacco", 2:"Not Consume Tobacco"}, inplace=True)
df['CLASIFFICATION_FINAL'].replace({1:"Infected", 2:"Non-Inffected"}, inplace=True)
df['ICU'].replace({1:"In ICU", 2:"Not in ICU"}, inplace=True)
df['DEAD'].replace({1:"Died", 2:"Not Died"}, inplace=True)
plt.figure(figsize=(12,4))
sns.boxplot(x='CLASIFFICATION_FINAL', y='AGE', data=df)
plt.title("Age vs Covid Results", fontsize=18, color='green');
patient=pd.crosstab(index=df['USMER'], columns=df['DEAD'])
patient.reset_index(inplace=True)
patient
| DEAD | USMER | Died | Not Died |
|---|---|---|---|
| 0 | No Primary Treatment | 33787 | 629116 |
| 1 | Primary Treatment | 43155 | 342517 |
fig=px.bar(data_frame=patient, x='USMER', y=['Died'], text_auto=True, color='USMER',
title="Count of Patients who died after Taking Primary Treatment" )
fig.show()
About 43k patients died after taking the primary treatment (11% of that group), and about 34k died with no primary treatment (5% of that group).
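Because the two groups differ a lot in size, death counts are easier to compare as rates. A minimal sketch that rebuilds the USMER table from the counts printed above and divides deaths by each group's total; on the full DataFrame, `pd.crosstab(..., normalize='index')` should give the same shares.

```python
import pandas as pd

# Counts copied from the USMER crosstab
patient = pd.DataFrame(
    {"Died": [33787, 43155], "Not Died": [629116, 342517]},
    index=["No Primary Treatment", "Primary Treatment"],
)

# Deaths as a percentage of each group's total
death_rate = (patient["Died"] / patient.sum(axis=1) * 100).round(1)
print(death_rate)
```

The same two lines work for any of the two-column Died / Not Died tables in this notebook.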
patient=pd.crosstab(index=df['MEDICAL_UNIT'], columns=df['DEAD'])
patient.reset_index(inplace=True)
patient
| DEAD | MEDICAL_UNIT | Died | Not Died |
|---|---|---|---|
| 0 | 1 | 5 | 146 |
| 1 | 2 | 5 | 164 |
| 2 | 3 | 1492 | 17683 |
| 3 | 4 | 39905 | 274500 |
| 4 | 5 | 607 | 6637 |
| 5 | 6 | 5790 | 34794 |
| 6 | 7 | 40 | 851 |
| 7 | 8 | 1171 | 9228 |
| 8 | 9 | 1369 | 36747 |
| 9 | 10 | 1468 | 6405 |
| 10 | 11 | 409 | 5168 |
| 11 | 12 | 24620 | 578375 |
| 12 | 13 | 61 | 935 |
fig=px.bar(data_frame=patient, x='MEDICAL_UNIT', y='Died', text_auto=True, color='MEDICAL_UNIT',
title="Count of Deaths by Medical Unit" )
fig.show()
More patients died in medical unit 4 (39,905) than in medical unit 12 (24,620), even though unit 12 treated nearly twice as many patients.
inff_patient=pd.crosstab(index=df['SEX'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | SEX | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Female | 181830 | 343234 |
| 1 | Male | 208298 | 315213 |
died_patient=pd.crosstab(index=df['SEX'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | SEX | Died | Not Died |
|---|---|---|---|
| 0 | Female | 27402 | 497662 |
| 1 | Male | 49540 | 473971 |
patients=pd.merge(inff_patient, died_patient, on='SEX', how='inner')
patients
| | SEX | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Female | 181830 | 343234 | 27402 | 497662 |
| 1 | Male | 208298 | 315213 | 49540 | 473971 |
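The crosstab, crosstab, merge pattern used here is repeated for many columns below, so it can be wrapped in one function. A sketch under assumptions: `infected_and_died` and the `toy` frame are hypothetical names introduced for illustration, and the toy rows are made up only to show the shape of the result.

```python
import pandas as pd

def infected_and_died(frame, column):
    """Return one row per category of `column` with infection and death counts."""
    infected = pd.crosstab(index=frame[column], columns=frame["CLASIFFICATION_FINAL"])
    died = pd.crosstab(index=frame[column], columns=frame["DEAD"])
    # Both tables are indexed by the same categories, so a join aligns them
    return infected.join(died).reset_index()

# Tiny made-up frame using the same labels as the notebook
toy = pd.DataFrame({
    "SEX": ["Female", "Female", "Male", "Male", "Male"],
    "CLASIFFICATION_FINAL": ["Infected", "Non-Inffected", "Infected", "Infected", "Non-Inffected"],
    "DEAD": ["Died", "Not Died", "Died", "Not Died", "Not Died"],
})

summary = infected_and_died(toy, "SEX")
print(summary)
```

With the real data, a call such as `infected_and_died(df, 'PNEUMONIA')` would stand in for the three statements repeated in each section.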
fig=px.bar(data_frame=patients, x='SEX', y=['Infected','Died'], text_auto=True,
title="Gender-wise Count of Infected & Died Patients" )
fig.show()
Men and women are infected at broadly similar rates (about 40% of men and 35% of women in the data).
Male deaths (49,540) are clearly higher than female deaths (27,402).
patient=pd.crosstab(index=df['PATIENT_TYPE'], columns=df['DEAD'])
patient.reset_index(inplace=True)
patient
| DEAD | PATIENT_TYPE | Died | Not Died |
|---|---|---|---|
| 0 | Hospitalized | 70066 | 129965 |
| 1 | Sent Home | 6876 | 841668 |
fig=px.bar(data_frame=patient, x='PATIENT_TYPE', y=['Died','Not Died'], text_auto=True,
title="Count of Patients died by Type" )
fig.show()
Most deaths occurred among hospitalized patients (70,066 of 76,942); about 35% of hospitalized patients died.
patient=pd.crosstab(index=df['INTUBED'], columns=df['DEAD'])
patient.reset_index(inplace=True)
patient
| DEAD | INTUBED | Died | Not Died |
|---|---|---|---|
| 0 | Not On Ventilator | 41601 | 117449 |
| 1 | On Ventilator | 26381 | 7275 |
fig=px.bar(data_frame=patient, x='INTUBED', y=['Died','Not Died'], text_auto=True,
title="Count of Patients died after keeping them on Ventilator" )
fig.show()
Many patients (41,601) died without being placed on a ventilator. Among patients who were on a ventilator, the chance of death was high: about 78% died (26,381 of 33,656).
inff_patient=pd.crosstab(index=df['PNEUMONIA'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | PNEUMONIA | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Not Pneumonia Patient | 305211 | 587323 |
| 1 | Pneumonia Patient | 84913 | 55125 |
died_patient=pd.crosstab(index=df['PNEUMONIA'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | PNEUMONIA | Died | Not Died |
|---|---|---|---|
| 0 | Not Pneumonia Patient | 22285 | 870249 |
| 1 | Pneumonia Patient | 53923 | 86115 |
patients=pd.merge(inff_patient, died_patient, on='PNEUMONIA', how='inner')
patients
| | PNEUMONIA | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Not Pneumonia Patient | 305211 | 587323 | 22285 | 870249 |
| 1 | Pneumonia Patient | 84913 | 55125 | 53923 | 86115 |
fig=px.bar(data_frame=patients, x='PNEUMONIA', y=['Infected','Died'], text_auto=True,
title="Count of Pneumonia Patients Infected and Died" )
fig.show()
Deaths among pneumonia patients (53,923) equal about 64% of the infected count (84,913); across all 140,038 pneumonia patients, about 39% died.
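The chart places "Infected" and "Died" side by side, but the Died count covers every pneumonia patient, not only the infected ones, so the ratio of the two bars is not a death rate among the infected. A sketch on made-up rows (the `toy` frame is hypothetical) showing how the two numbers can differ and how the rate can be computed by filtering first:

```python
import pandas as pd

# Made-up rows, only to illustrate the calculation
toy = pd.DataFrame({
    "CLASIFFICATION_FINAL": ["Infected", "Infected", "Infected", "Non-Inffected", "Non-Inffected"],
    "DEAD": ["Died", "Not Died", "Not Died", "Died", "Died"],
})

# Ratio of the two bars: all deaths divided by all infected
bar_ratio = (toy["DEAD"] == "Died").sum() / (toy["CLASIFFICATION_FINAL"] == "Infected").sum()

# Death rate among infected patients only
infected = toy[toy["CLASIFFICATION_FINAL"] == "Infected"]
rate_among_infected = (infected["DEAD"] == "Died").mean()

print(bar_ratio, rate_among_infected)
```

In this toy frame the bar ratio is 1.0 while only one of the three infected patients died. The same filter applied to `df` before the crosstab would give the death rate among infected patients for any condition.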
inff_patient=pd.crosstab(index=df['PREGNANT'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | PREGNANT | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Not Pregnent | 177693 | 335486 |
| 1 | Pregnent | 2754 | 5377 |
died_patient=pd.crosstab(index=df['PREGNANT'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | PREGNANT | Died | Not Died |
|---|---|---|---|
| 0 | Not Pregnent | 27246 | 485933 |
| 1 | Pregnent | 89 | 8042 |
patients=pd.merge(inff_patient, died_patient, on='PREGNANT', how='inner')
patients
| | PREGNANT | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Not Pregnent | 177693 | 335486 | 27246 | 485933 |
| 1 | Pregnent | 2754 | 5377 | 89 | 8042 |
fig=px.bar(data_frame=patients, x='PREGNANT', y=['Infected','Died'], text_auto=True,
title="Count of Infected & Died Patients by Pregnancy" )
fig.show()
Very few pregnant women were infected (2,754), and their death count is also very low (89).
inff_patient=pd.crosstab(index=df['DIABETES'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | DIABETES | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Diabetic Patient | 61445 | 63544 |
| 1 | Non Diabetic Patient | 327276 | 592972 |
died_patient=pd.crosstab(index=df['DIABETES'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | DIABETES | Died | Not Died |
|---|---|---|---|
| 0 | Diabetic Patient | 28265 | 96724 |
| 1 | Non Diabetic Patient | 47946 | 872302 |
patients=pd.merge(inff_patient, died_patient, on='DIABETES', how='inner')
patients
| | DIABETES | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Diabetic Patient | 61445 | 63544 | 28265 | 96724 |
| 1 | Non Diabetic Patient | 327276 | 592972 | 47946 | 872302 |
fig=px.bar(data_frame=patients, x='DIABETES', y=['Infected','Died'], text_auto=True,
title="Count of Infected & Died Patients by Diabetes" )
fig.show()
inff_patient=pd.crosstab(index=df['COPD'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | COPD | Infected | Non-Inffected |
|---|---|---|---|
| 0 | COP Patient | 6048 | 9014 |
| 1 | Non COP Patient | 382806 | 647704 |
died_patient=pd.crosstab(index=df['COPD'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | COPD | Died | Not Died |
|---|---|---|---|
| 0 | COP Patient | 4021 | 11041 |
| 1 | Non COP Patient | 72210 | 958300 |
patients=pd.merge(inff_patient, died_patient, on='COPD', how='inner')
patients
| | COPD | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | COP Patient | 6048 | 9014 | 4021 | 11041 |
| 1 | Non COP Patient | 382806 | 647704 | 72210 | 958300 |
fig=px.bar(data_frame=patients, x='COPD', y=['Infected','Died'], text_auto=True,
title="Count of COPD Patients Infected and Died" )
fig.show()
Deaths among COPD patients are high relative to infections: 4,021 deaths against 6,048 infected (about 66%), or about 27% of all 15,062 COPD patients.
inff_patient=pd.crosstab(index=df['ASTHMA'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | ASTHMA | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Asthama Patient | 10382 | 21190 |
| 1 | Non Asthama Patient | 378478 | 635546 |
died_patient=pd.crosstab(index=df['ASTHMA'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | ASTHMA | Died | Not Died |
|---|---|---|---|
| 0 | Asthama Patient | 1480 | 30092 |
| 1 | Non Asthama Patient | 74758 | 939266 |
patients=pd.merge(inff_patient, died_patient, on='ASTHMA', how='inner')
patients
| | ASTHMA | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Asthama Patient | 10382 | 21190 | 1480 | 30092 |
| 1 | Non Asthama Patient | 378478 | 635546 | 74758 | 939266 |
fig=px.bar(data_frame=patients, x='ASTHMA', y=['Infected','Died'], text_auto=True,
title="Count of Asthma Patients Infected and Died" )
fig.show()
Deaths among asthma patients (1,480) equal about 14% of the infected count (10,382), or about 5% of all asthma patients.
inff_patient=pd.crosstab(index=df['INMSUPR'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | INMSUPR | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Strong Immune System | 383990 | 647011 |
| 1 | Weak Immune System | 4726 | 9444 |
died_patient=pd.crosstab(index=df['INMSUPR'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | INMSUPR | Died | Not Died |
|---|---|---|---|
| 0 | Strong Immune System | 73569 | 957432 |
| 1 | Weak Immune System | 2618 | 11552 |
patients=pd.merge(inff_patient, died_patient, on='INMSUPR', how='inner')
patients
| | INMSUPR | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Strong Immune System | 383990 | 647011 | 73569 | 957432 |
| 1 | Weak Immune System | 4726 | 9444 | 2618 | 11552 |
fig=px.bar(data_frame=patients, x='INMSUPR', y=['Infected','Died'], text_auto=True,
title="Patients with a Weak Immune System: Infected and Died" )
fig.show()
Deaths among patients with a weak immune system (2,618) equal about 55% of the infected count (4,726), or about 18% of all such patients.
inff_patient=pd.crosstab(index=df['HIPERTENSION'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | HIPERTENSION | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Hipertension Patient | 75946 | 86783 |
| 1 | Non Hipertension Patience | 312833 | 569909 |
died_patient=pd.crosstab(index=df['HIPERTENSION'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | HIPERTENSION | Died | Not Died |
|---|---|---|---|
| 0 | Hipertension Patient | 32061 | 130668 |
| 1 | Non Hipertension Patience | 44191 | 838551 |
patients=pd.merge(inff_patient, died_patient, on='HIPERTENSION', how='inner')
patients
| | HIPERTENSION | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Hipertension Patient | 75946 | 86783 | 32061 | 130668 |
| 1 | Non Hipertension Patience | 312833 | 569909 | 44191 | 838551 |
fig=px.bar(data_frame=patients, x='HIPERTENSION', y=['Infected','Died'], text_auto=True,
title="Count of Hypertension Patients Infected and Died" )
fig.show()
Deaths among hypertension patients (32,061) equal about 42% of the infected count (75,946), or about 20% of all hypertension patients.
inff_patient=pd.crosstab(index=df['CARDIOVASCULAR'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | CARDIOVASCULAR | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Heart Patient | 8423 | 12346 |
| 1 | Not Heart Patient | 380350 | 644380 |
died_patient=pd.crosstab(index=df['CARDIOVASCULAR'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | CARDIOVASCULAR | Died | Not Died |
|---|---|---|---|
| 0 | Heart Patient | 4435 | 16334 |
| 1 | Not Heart Patient | 71774 | 952956 |
patients=pd.merge(inff_patient, died_patient, on='CARDIOVASCULAR', how='inner')
patients
| | CARDIOVASCULAR | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Heart Patient | 8423 | 12346 | 4435 | 16334 |
| 1 | Not Heart Patient | 380350 | 644380 | 71774 | 952956 |
fig=px.bar(data_frame=patients, x='CARDIOVASCULAR', y=['Infected','Died'], text_auto=True,
title="Count of Heart Patients Infected and Died" )
fig.show()
Deaths among patients with cardiovascular disease (4,435) equal about 53% of the infected count (8,423), or about 21% of all such patients.
inff_patient=pd.crosstab(index=df['OBESITY'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | OBESITY | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Not Obese | 316391 | 569336 |
| 1 | Obese | 72413 | 87403 |
died_patient=pd.crosstab(index=df['OBESITY'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | OBESITY | Died | Not Died |
|---|---|---|---|
| 0 | Not Obese | 58933 | 826794 |
| 1 | Obese | 17294 | 142522 |
patients=pd.merge(inff_patient, died_patient, on='OBESITY', how='inner')
patients
| | OBESITY | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Not Obese | 316391 | 569336 | 58933 | 826794 |
| 1 | Obese | 72413 | 87403 | 17294 | 142522 |
fig=px.bar(data_frame=patients, x='OBESITY', y=['Infected','Died'], text_auto=True,
title="Count of Obese Patients Infected and Died" )
fig.show()
Deaths among obese patients (17,294) equal about 24% of the infected count (72,413), or about 11% of all obese patients.
inff_patient=pd.crosstab(index=df['RENAL_CHRONIC'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | RENAL_CHRONIC | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Chronic | 7820 | 11084 |
| 1 | Not Chronic | 380994 | 645671 |
died_patient=pd.crosstab(index=df['RENAL_CHRONIC'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | RENAL_CHRONIC | Died | Not Died |
|---|---|---|---|
| 0 | Chronic | 5707 | 13197 |
| 1 | Not Chronic | 70532 | 956133 |
patients=pd.merge(inff_patient, died_patient, on='RENAL_CHRONIC', how='inner')
patients
| | RENAL_CHRONIC | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Chronic | 7820 | 11084 | 5707 | 13197 |
| 1 | Not Chronic | 380994 | 645671 | 70532 | 956133 |
fig=px.bar(data_frame=patients, x='RENAL_CHRONIC', y=['Infected','Died'], text_auto=True,
title="Count of Chronic Renal Disease Patients Infected and Died" )
fig.show()
Deaths among patients with chronic renal disease (5,707) equal about 73% of the infected count (7,820), or about 30% of all such patients.
inff_patient=pd.crosstab(index=df['TOBACCO'], columns=df['CLASIFFICATION_FINAL'])
inff_patient.reset_index(inplace=True)
inff_patient
| CLASIFFICATION_FINAL | TOBACCO | Infected | Non-Inffected |
|---|---|---|---|
| 0 | Consume Tobacco | 28624 | 55752 |
| 1 | Not Consume Tobacco | 360106 | 600873 |
died_patient=pd.crosstab(index=df['TOBACCO'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | TOBACCO | Died | Not Died |
|---|---|---|---|
| 0 | Consume Tobacco | 6596 | 77780 |
| 1 | Not Consume Tobacco | 69621 | 891358 |
patients=pd.merge(inff_patient, died_patient, on='TOBACCO', how='inner')
patients
| | TOBACCO | Infected | Non-Inffected | Died | Not Died |
|---|---|---|---|---|---|
| 0 | Consume Tobacco | 28624 | 55752 | 6596 | 77780 |
| 1 | Not Consume Tobacco | 360106 | 600873 | 69621 | 891358 |
fig=px.bar(data_frame=patients, x='TOBACCO', y=['Infected','Died'], text_auto=True,
title="Count of Tobacco Users Infected and Died" )
fig.show()
Deaths among tobacco users (6,596) equal about 23% of the infected count (28,624), or about 8% of all tobacco users.
died_patient=pd.crosstab(index=df['ICU'], columns=df['DEAD'])
died_patient.reset_index(inplace=True)
died_patient
| DEAD | ICU | Died | Not Died |
|---|---|---|---|
| 0 | In ICU | 8195 | 8663 |
| 1 | Not in ICU | 59775 | 115910 |
fig=px.bar(data_frame=died_patient, x='ICU', y=['Not Died','Died'], text_auto=True,
title="Patients died in ICU" )
fig.show()
About 49% of the patients admitted to the ICU died (8,195 of 16,858). Most deaths occurred outside the ICU (59,775).
df['Date']=df['DATE_DIED'].dt.to_period('M')
deaths=pd.crosstab(index=df['Date'], columns=df['DEAD'])
deaths.reset_index(inplace=True)
deaths
| DEAD | Date | Died |
|---|---|---|
| 0 | 2020-01 | 2738 |
| 1 | 2020-02 | 2613 |
| 2 | 2020-03 | 2819 |
| 3 | 2020-04 | 8192 |
| 4 | 2020-05 | 16621 |
| 5 | 2020-06 | 17888 |
| 6 | 2020-07 | 12401 |
| 7 | 2020-08 | 2850 |
| 8 | 2020-09 | 2631 |
| 9 | 2020-10 | 2609 |
| 10 | 2020-11 | 2552 |
| 11 | 2020-12 | 2711 |
| 12 | 2021-01 | 55 |
| 13 | 2021-02 | 43 |
| 14 | 2021-03 | 34 |
| 15 | 2021-04 | 156 |
| 16 | 2021-05 | 2 |
| 17 | 2021-06 | 5 |
| 18 | 2021-07 | 3 |
| 19 | 2021-08 | 4 |
| 20 | 2021-09 | 2 |
| 21 | 2021-10 | 5 |
| 22 | 2021-11 | 3 |
| 23 | 2021-12 | 5 |
People of any age are getting infected by the coronavirus.
More people died after taking the primary treatment (about 43k, 11% of that group) than without it (about 34k, 5%), and more patients died in medical unit 4 than in medical unit 12.
Men and women are infected at broadly similar rates, but male deaths are clearly higher than female deaths.
Most deaths occurred among hospitalized patients, about 35% of whom died. Many patients died without being placed on a ventilator, and among those who were on a ventilator about 78% died.
Very few pregnant women were infected, and their death count is also very low.
Relative to the number of infected patients in each group, deaths amount to about 64% for pneumonia, 46% for diabetes, 66% for COPD, 14% for asthma, 55% for a weak immune system, 42% for hypertension, 53% for cardiovascular disease, 24% for obesity, 73% for chronic renal disease and 23% for tobacco use.
These ratios set deaths among all patients in a group against its infected count, so they are not death rates among the infected. As a share of all patients in each group, the figures are about 39%, 23%, 27%, 5%, 18%, 20%, 21%, 11%, 30% and 8% respectively, against about 7% for patients with a strong immune system.
About 49% of the patients admitted to the ICU died, but most deaths occurred outside the ICU.
May, June and July 2020 were the peak of Covid, and the most deaths were recorded in these three months. The counts for the other months depend on how ambiguous day/month dates are parsed, so they are worth re-checking with day-first parsing.